International Journal of Data Science and Big Data Analytics
Volume 3, Issue 1, May 2023
Research Paper | Open Access
An Optimal WordNet Based Emotional Word Extraction and Hybrid Deep Learning Classifier for Sentiment Analysis
1School of Computing & Information Technology Jomo Kenyatta University of Agriculture and Technology Nairobi, Kenya. E-mail: smobareo@gmail.com
*Corresponding Author
Int.J.Data.Sci. & Big Data Anal. 3(1) (2023) 25-44, DOI: https://doi.org/10.51483/IJDSBDA.3.1.2023.25-44
Received: 24/08/2022 | Accepted: 17/03/2023 | Published: 05/05/2023
Every day, over one billion social media text messages are generated worldwide, a rich source of data that can improve the lives of citizens through evidence-based decision making. Twitter is rich in such data, but processing tweets poses a number of challenges with respect to volume, speed, and the ambiguity of the language in which tweets are written; this is an information extraction problem in the domain of Natural Language Processing (NLP). While there is growing interest in sentiment analysis for detecting emotions from tweets, there have been no major efforts to detect emotions that are disguised in tweets based on the context of word usage, which is important for tasks such as identifying hate speech and mental health related disorders. This paper presents a novel approach to context-based hate speech detection based on an optimal WordNet. Using a modified Mayfly Optimization algorithm (MMO), we pre-process tweets and normalize the data through an NLP pipeline. We augment an improved Horse Herd Optimization algorithm (HHO) with WordNet and SentiWordNet to compute tweet sentiment polarity. A hybrid Deep Belief Artificial Neural Network (hybrid DB-ANN) is then used to classify tweets. The performance of the proposed approach is compared with the best-known sentiment analysis algorithms on three standard benchmark datasets: Crowdflower-1, Crowdflower-2, and Kaggle Twitter. We demonstrate that the proposed approach outperforms industry benchmarks, achieving over 90% accuracy, precision, recall, and F-measure.
Keywords: Natural Language Processing, Sentiment Analysis, WordNet, SentiWordNet, Tweet classification, Emotional words
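To give a rough sense of what lexicon-based tweet polarity scoring can look like, the sketch below averages per-word (positive minus negative) scores. It is not the HHO-augmented method described in the abstract. The lexicon `TOY_LEXICON`, its scores, the helper names `tweet_polarity` and `polarity_label`, and the `threshold` value are all hypothetical stand-ins chosen for illustration; a real system would look the scores up in SentiWordNet.

```python
# Toy stand-in for a SentiWordNet-style lexicon: word -> (positive, negative).
# The entries and scores are invented for illustration only.
TOY_LEXICON = {
    "love": (0.75, 0.0),
    "great": (0.625, 0.0),
    "hate": (0.0, 0.75),
    "awful": (0.0, 0.875),
}


def tweet_polarity(tweet, lexicon=TOY_LEXICON):
    """Mean (positive - negative) score over the tweet's words found in the lexicon."""
    # Naive whitespace tokenization; strip common punctuation and Twitter markers.
    tokens = [t.strip(".,!?#@").lower() for t in tweet.split()]
    scores = [lexicon[t][0] - lexicon[t][1] for t in tokens if t in lexicon]
    # Tweets with no lexicon words are treated as neutral (score 0.0).
    return sum(scores) / len(scores) if scores else 0.0


def polarity_label(score, threshold=0.1):
    """Map a numeric score to a coarse label; the threshold is an assumed value."""
    if score > threshold:
        return "positive"
    if score < -threshold:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    for text in ["I love this great phone!", "I hate this awful service", "hello world"]:
        score = tweet_polarity(text)
        print(text, score, polarity_label(score))
```

Simple averaging of this kind ignores the context of word usage, which is the gap the abstract says the proposed approach targets.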
Copyright © SvedbergOpen. All rights reserved